In [1]:
from IPython.display import display, HTML

display(HTML('''
<hr>
<h3>🔗 Links</h3>
<ul>
    <li><a href="https://github.com/UIUC-iSchool-DataViz/is445_data/raw/main/ufo-scrubbed-geocoded-time-standardized-00.csv" target="_blank">📁 The Data</a></li>
    <li><a href="https://github.com/rujulkhatavkar/hw5-ufo-viz/blob/main/hw5_jekyll_page.ipynb" target="_blank">🧠 The Analysis (Notebook)</a></li>
</ul>
'''))
In [5]:
# The following code is used to create a visualization for the UFO dataset analysis.
Out[5]:

This visualization shows the number of reported UFO sightings in the United States over time, broken down by reported shape. The data is grouped by month and shape and displayed as a line chart, with each line representing a different shape category (e.g., circle, triangle, fireball). A dropdown menu serves as the interactive element, letting the viewer filter the chart to a specific shape.

For encoding types, I used:

Temporal encoding on the x-axis for year_month, formatted as a time field (T) to display a continuous monthly timeline.

Quantitative encoding on the y-axis to represent the count of sightings.

Nominal encoding for the color, using the shape field to distinguish different UFO shapes with separate lines and a color legend.

Regarding the color design, I used Altair’s default categorical color palette to keep the shapes visually distinct. Since shape is a nominal field with multiple categories, color was the most intuitive way to differentiate between them. This choice supports visual clarity when comparing multiple lines in the same space.

On the data transformation side, I first converted the datetime column to a valid datetime format, then dropped rows with missing values in the shape or datetime columns. I then created a new year_month field by converting the datetime to monthly periods and grouped the data by year_month and shape, counting the number of sightings per group. This grouped data was then used to construct the line chart.
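A rough pandas sketch of this cleaning and grouping step is shown below. The sample rows, the helper name `monthly_shape_counts`, and the explicit timestamp format are assumptions made for illustration; only the `datetime`, `shape`, and `year_month` column names come from the description above.

```python
import pandas as pd

# Tiny made-up sample standing in for the UFO CSV.
raw = pd.DataFrame({
    "datetime": [
        "10/10/1999 20:30",
        "10/15/1999 21:00",
        "11/02/1999 19:00",
        "not a date",
        "11/20/1999 22:15",
    ],
    "shape": ["circle", "circle", "triangle", "light", None],
})

def monthly_shape_counts(df):
    """Count sightings per (month, shape) pair. Hypothetical helper."""
    out = df.copy()
    # Assumed timestamp format; unparseable values become NaT.
    out["datetime"] = pd.to_datetime(
        out["datetime"], format="%m/%d/%Y %H:%M", errors="coerce"
    )
    # Drop rows that lack a usable timestamp or a shape.
    out = out.dropna(subset=["datetime", "shape"])
    # Collapse each timestamp to its calendar month.
    out["year_month"] = out["datetime"].dt.to_period("M")
    # One row per (month, shape) with the number of sightings.
    return (
        out.groupby(["year_month", "shape"])
        .size()
        .reset_index(name="count")
    )

counts = monthly_shape_counts(raw)
```

Passing `errors="coerce"` means malformed timestamps are dropped along with the missing values instead of raising an error.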

The interactivity is implemented using Altair’s selection_point with a dropdown menu bound to the shape field. This lets users explore trends for a specific UFO shape without overwhelming the chart with all lines at once. It enhances clarity and allows for focused comparison over time.

In [7]:
# UFO Sightings in the US with Duration Filter
Out[7]:

This visualization shows the trend of UFO sightings over time, grouped by shape and aggregated monthly. It leverages a dropdown menu to let users select a specific UFO shape and observe how its frequency changed over time. Unlike traditional legends, the dropdown keeps the view uncluttered and helps focus analysis on one shape at a time.

In terms of encoding, the chart uses:

Temporal encoding on the x-axis (year_month:T) to display a continuous time scale by month.

Quantitative encoding on the y-axis (count:Q) to show the number of sightings per shape.

Color encoding (color='shape:N') is used to visually differentiate shapes when multiple are visible, although interactivity usually keeps only one visible at a time.

I used Altair’s default color scheme for categorical variables, which is suitable for distinguishing nominal data like UFO shape types. This avoids visual ambiguity and makes shape comparisons clear when multiple lines are shown.

From a data transformation standpoint, I started by cleaning the dataset: converting the datetime column to a valid timestamp and dropping rows with missing shape or date values. I then created a new year_month column holding each sighting's month and used a groupby operation on both year_month and shape to count the number of sightings for each shape per month. The grouped data was formatted to support Altair’s time-based plotting functions.
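The write-up does not say what the final formatting step looked like. One plausible version, shown here as an assumption, converts the monthly period values into ordinary timestamps so that the `year_month:T` encoding receives real datetimes. The sample frame is made up.

```python
import pandas as pd

# Hypothetical grouped result with a monthly Period column.
counts = pd.DataFrame({
    "year_month": pd.PeriodIndex(["1999-10", "1999-11"], freq="M"),
    "shape": ["circle", "triangle"],
    "count": [2, 1],
})

# Replace each monthly period with the timestamp of its first day.
counts["year_month"] = counts["year_month"].dt.to_timestamp()
```

After this step each month is represented by its first day (for example 1999-10-01), which a temporal axis can place on a continuous timeline.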

The interactivity here is implemented with a dropdown selector bound to the shape field, using Altair's selection_point. Setting empty='all' ensures that all shapes are shown by default, and filtering only occurs when the user selects a shape. This approach avoids overwhelming the user with too many overlapping lines and enhances clarity when analyzing individual trends.
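The "show everything until a shape is picked" behavior can be imitated in plain pandas to make the semantics concrete. This sketch only mimics the filter; it is not how Altair implements it, and `rows_to_draw` is a hypothetical helper. In Altair 5 the same default is, to my understanding, spelled `empty=True`, with the string `'all'` being the older form.

```python
import pandas as pd

# Hypothetical monthly counts standing in for the grouped UFO data.
counts = pd.DataFrame({
    "year_month": pd.to_datetime(
        ["1999-10-01", "1999-11-01", "1999-10-01", "1999-11-01"]
    ),
    "shape": ["circle", "circle", "triangle", "triangle"],
    "count": [2, 3, 1, 4],
})

def rows_to_draw(df, selected_shape=None):
    """Return the rows the chart would keep for a given dropdown choice."""
    # No choice yet plays the role of the empty selection: keep every shape.
    if selected_shape is None:
        return df
    # Otherwise keep only the rows for the chosen shape.
    return df[df["shape"] == selected_shape]

everything = rows_to_draw(counts)
only_circles = rows_to_draw(counts, "circle")
```

With the opposite setting, an empty selection would match nothing and the chart would start out blank.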
